BAYESIAN LATENT PATTERN MIXTURE MODELS FOR
HANDLING ATTRITION IN PANEL STUDIES WITH 
REFRESHMENT SAMPLES 

By Yajuan Si*, Jerome P. Reiter† and D. Sunshine Hillygus†
University of Wisconsin-Madison* and Duke University†

Many panel studies collect refreshment samples—new, randomly 
sampled respondents who complete the questionnaire at the same 
time as a subsequent wave of the panel. With appropriate modeling, 
these samples can be leveraged to correct inferences for biases caused 
by non-ignorable attrition. We present such a model when the panel 
includes many categorical survey variables. The model relies on a 
Bayesian latent pattern mixture model, in which an indicator for 
attrition and the survey variables are modeled jointly via a latent 
class model. We allow the multinomial probabilities within classes to 
depend on the attrition indicator, which offers additional flexibility 
over standard applications of latent class models. We present results 
of simulation studies that illustrate the benefits of this flexibility. We 
apply the model to correct attrition bias in an analysis of data from 
the 2007-2008 Associated Press/Yahoo News election panel study. 


1. Introduction. Many longitudinal or panel surveys, in which the
same individuals are interviewed repeatedly at different points in time,
suffer from panel attrition. For example, in the American National Election
Study, 47% of respondents who completed the first wave in January 2008
failed to complete the follow-up wave in June 2010. Such attrition can result
in biased inferences when the attrition generates non-ignorable missing data; 
that is, the reasons for attrition depend on values of unobserved variables 
(e.g., Schluchte, 1982; Brown, 1990; Diggle and Kenward, 1994; Ibrahim, 
Lipsitz and Chen, 1999; Scharfstein, Rotnitzky and Robins, 1999; Olsen, 
2005; Behr, Bellgardt and Rendtel, 2005; Bhattacharya, 2008; Hogan and 
Daniels, 2008). 

Unfortunately, it is not possible to determine whether the attrition is
ignorable or non-ignorable, nor the extent to which attrition impacts
inferences, using the collected data alone. Consequently, analysts have to rely
on strong and generally unverifiable assumptions about the attrition
process. Many assume that attrition is a missing at random (MAR) process;
for example, MAR assumptions underlie the use of post-stratification to
adjust survey weights (e.g., Holt and Smith, 1979; Gelman and Carlin, 2001;
Henderson, Hillygus and Tompson, 2013) and off-the-shelf multiple
imputation routines to create completed datasets (e.g., Pasek et al., 2009;
Honaker and King, 2010). Others allow for specific not missing at random
(NMAR) processes, characterizing the attrition with a selection model
(Hausman and Wise, 1979; Brehm, 1993; Kenward, 1998; Scharfstein, Rotnitzky
and Robins, 1999) or pattern mixture model (Little, 1993, 1994; Daniels and
Hogan, 2000; Roy, 2003; Kenward, Molenberghs and Thijs, 2003; Lin,
McCulloch and Rosenheck, 2004; Roy and Daniels, 2008).

Many panel surveys supplement the original panel with refreshment
samples. These are cross-sectional, random samples of new respondents given
the questionnaire at the same time as a subsequent wave of the panel. For
example, refreshment samples are included in the National Educational
Longitudinal Study of 1988, which followed a nationally representative sample
of 21,500 eighth graders in two-year intervals until 2000 and refreshed with
cross-sectional samples in 1990 and 1992. Overlapping or rotating panels, in
which a new study cohort completes its first wave at the same time that a
previous cohort completes a second or later wave, offer equivalent information.

Refreshment samples offer information that can be utilized to correct
inferences for non-ignorable panel attrition (Hirano et al., 1998; Bartels,
1999; Hirano et al., 2001; Sekhon, 2004; Bhattacharya, 2008; Deng et al.,
2013). In particular, analysts can use an additive non-ignorable (AN) model,
which comprises a model for the survey variables coupled with a selection
model for the attrition process (Hirano et al., 1998). The selection model
must be additive in the variables observed and missing due to attrition so
that model parameters are identifiable.

Specifying the models for the survey variables and the attrition indicator
can be challenging, even when the data include only a modest number of
variables. Consider, for example, a multinomial survey outcome modeled as
a function of ten categorical predictors. It is difficult to determine which
interaction terms to include in the model, especially in the presence of
missing data due to attrition (Erosheva, Fienberg and Junker, 2002; Vermunt
et al., 2008; Si and Reiter, 2013). The model specification task is even more
complicated when the analyst seeks to model all survey variables jointly, for
example, with a log-linear model or sequence of conditional models (e.g.,
specify $f(a)$, then $f(b \mid a)$, then $f(c \mid a, b)$, and so on). Joint
modeling can be useful when the survey variables suffer from item nonresponse.

Recognizing this, Si, Reiter and Hillygus (2014) propose to use a Dirichlet
process mixture of products of multinomial distributions (Dunson and Xing,
2009; Si and Reiter, 2013) to model the survey variables. This offers the
analyst the potential to capture complex dependencies among variables without
selecting interaction effects, as well as to handle item nonresponse among the
survey variables. However, for the attrition indicator model, Si, Reiter and
Hillygus (2014) use probit regression with only main effects for the survey
variables, eschewing the task of selecting interaction effects. While
convenient, using a main-effects-only specification makes assumptions about the
attrition mechanism that may not be realistic in practice. Furthermore,
probit regressions can suffer from the effects of separability and near
collinearity among predictors (Gelman et al., 2008), which complicates
estimation of the AN model.

In this article, we present an alternative approach for leveraging
refreshment samples based on Bayesian latent pattern mixture (BLPM) models. We
focus on models for categorical variables. The key idea is to use the
Dirichlet process mixture of products of multinomial distributions for the survey
variables and attrition indicator jointly, thus avoiding specification of an
explicit selection model. We note that several other authors (e.g., Muthen,
Jo and Brown, 2003; Roy, 2003; Lin, McCulloch and Rosenheck, 2004) have
proposed using mixture models for handling attrition outside of the
context of refreshment samples. As we show, the refreshment sample enables
us to allow the multinomial vectors within mixture components to depend
on attrition indicators, thereby encoding a flexible imputation engine that
reduces reliance on conditional independence assumptions.

We were motivated by attrition in the Associated Press/Yahoo 2008
Election Panel (APYN) study, a multi-wave longitudinal survey designed to track
the attitudes and opinions of the American public during the 2008
presidential election campaign. The APYN study was the basis of dozens of news
stories during the campaign and subsequent academic analyses of the
election in the years since. However, the study lost more than one third of the
original sample to attrition by the final wave of data collection, calling into
question the accuracy of analyses based on the complete cases. The APYN
included a refreshment sample in the final pre-election wave of data
collection, which we leverage via the BLPM model to create attrition-adjusted,
multiply imputed datasets. We use the multiply imputed data to examine
dynamics of public opinion in the 2008 presidential campaign.

The remainder of the article is organized as follows. In Section 2, we
introduce the APYN data. In Section 3, we describe pattern mixture models
for refreshment samples, including conditions under which model
parameters are data-identified. To our knowledge, this is the first description of
pattern mixture models in this context. In Section 4, we propose and
motivate the BLPM model for refreshment sample contexts. In Section 5, we
illustrate properties of the BLPM model with simulation studies. Here, we
demonstrate the benefits of allowing the multinomial vectors within
mixture components to depend on attrition indicators. In Section 6, we analyze
the American electorate in the 2008 presidential election, using the BLPM
model to account for attrition in the APYN data. Finally, in Section 7 we
summarize and discuss future research directions.

Table 1

APYN variables from wave 1 (W1), wave 2 (W2) and refreshment sample (Ref), with
rates of item nonresponse. Item nonresponse arises either from refusals to answer the
question (respondent proceeded to the next question without giving a response) or
selection of a “Don’t know enough to say” response. We note that 1,011 of the wave 1
participants attrited from the panel by wave 2, which could result in attrition bias.

                                                  Item nonresponse counts (%)
Variable                                 Levels   W1: 2735     W2: 1724   Ref: 464
Obama favorability                       2        550 (20.1)   95 (5.5)   20 (4.3)
Party identification (Dem., Rep., Ind.)  3        13 (0.5)     -          9 (1.9)
Ideology (Lib., Mod., Con.)              3        57 (2.1)     -          10 (2.2)
Age (18-29, 30-44, 45-59, 60+)           4        0            -          0
Education (< HS, Some coll., Coll.)      3        0            -          0
Race (White, Non-white)                  2        0            -          0
Gender                                   2        0            -          0
Income (K$) (<30, 30-50, 50-75, >75)     4        0            -          0
Married indicator                        2        0            -          0

2. Description of APYN Data. The APYN study included eleven
waves of data collection and three refreshment samples spanning the 2008
primary and general U.S. election season. The survey was sampled from the
GfK KnowledgePanel, which is one of the nation’s only online,
probability-based respondent pools designed to be statistically representative of the
U.S. population. The respondent pool is recruited via a probability-based
sampling method using published sampling frames that cover 96% of the U.S.
population. Sampled non-internet households are provided with a laptop and
free internet service. Individuals in the respondent pool are then invited to
participate in online surveys, such as the APYN panel survey. Surveys from
the GfK KnowledgePanel are approved by the Office of Management and
Budget for government research and have been used in hundreds of academic
publications spanning diverse disciplines, including health and medicine,
psychology, social sciences, public policy, and survey and statistical
methodology. More information about the survey methodology can be found at
http://www.knowledgenetworks.com/ganp/election2008/index.html.

Wave 1 of the APYN was fielded on November 2, 2007 and was
completed by 2,735 respondents out of 3,548 contacted individuals. After the
initial wave, these wave 1 respondents were invited to participate in each
follow-up wave, even if they failed to respond to the previous one.
Consequently, wave-to-wave attrition rates or completion rates vary across the
study. Three external refresh cross-sections were also collected: a sample of
697 new respondents in January, 576 new respondents in September, and 464
new respondents in October. Each of the refreshment samples is a random
and cross-sectional sample of the GfK respondent pool. Our analysis focuses
on wave 1 (November 2007) and the ninth wave with a corresponding
refreshment sample (October 2008, the final wave before the election), which
we label wave 2 for presentational clarity. As shown in Table 1, of those who
completed wave 1, 1,011 (37%) respondents failed to complete the October
wave. In previous research using the APYN data (Pasek et al., 2009;
Henderson and Hillygus, 2011; Iyengar, Sood and Lelkes, 2012; Henderson,
Hillygus and Tompson, 2013), scholars have mostly relied on post-stratification
weights to correct for potential panel attrition bias, although Pasek et al.
(2009) used standard multiple imputation via Amelia II (King et al., 2001).
Deng et al. (2013) outline the limitations of such approaches: both assume
that the attrition is MAR.

The primary outcome of interest in pre-election polls tends to be
evaluations of the candidates, as analysts attempt to gauge levels of candidate
support within the electorate. Which candidate is most likely to win the
election? Who in the electorate supports each side? Because the earliest
waves of the APYN took place before the ballot match-up was known (i.e.,
before Obama and McCain had been selected as their party nominees), we
focus on Obama favorability (coded as favorable or not). This variable offers
exact comparability in question wording across survey waves and is highly
correlated with eventual vote choice (the tetrachoric correlation of the items
in wave 2 is 0.97). In examining Obama favorability, we consider standard
covariates from the voting behavior literature. These include demographic
variables (from “Age” to “Marital status” in Table 1) previously shown to be
related to candidate evaluations and/or panel attrition (Frankel and
Hillygus, 2013).¹ We also consider two relevant political background variables
(“Party identification” and “Ideology” in Table 1) that are typically
considered time invariant in the context of a single election cycle (Bartels et al.,
2011).


¹Demographic and political profile variables are collected in profile surveys when a
panelist joins the KnowledgePanel and are updated continually; thus, they have few missing
values for any one study.
Table 2

Structure of panel and refreshment samples. Notation for sample sizes in parentheses.
The total number of individuals in both datasets is N = N_p + N_r.

                            Time-Invariant   Wave 1     Wave 2
Panel (N_p)                 X                Y_1        Y_2, W = 1 (N_cp)
                                                        Y_2 = ?, W = 0 (N_ip)
Refreshment Sample (N_r)    X                Y_1 = ?    Y_2, W = ?


3. Additive Pattern Mixture Models for Refreshment Samples.

Before introducing the BLPM model and analyzing the APYN data, we
review the AN model of Hirano et al. (1998) and present a corresponding
pattern mixture model formulation. Suppose the data comprise a two-wave
panel of $N_p$ individuals with a refreshment sample of $N_r$ new individuals
in the second wave. For all $N = N_p + N_r$ individuals, the data include $q_0$
time-invariant variables $X = (X_1, \ldots, X_{q_0})$, such as demographic or frame
variables. Let $Y_1 = (Y_{11}, \ldots, Y_{1q_1})$ be the $q_1$ survey variables of interest
collected in wave 1. Let $Y_2 = (Y_{21}, \ldots, Y_{2q_2})$ be the corresponding $q_2$
survey variables collected in wave 2. Here, we assume that $Y_1$ and $Y_2$ comprise
the same variables collected at different waves, although this is not necessary.
Among the $N_p$ individuals, $N_{cp} < N_p$ provide at least some data in the
second wave, and the remaining $N_{ip} = N_p - N_{cp}$ individuals drop out of
the panel. The refreshment sample includes only $(X, Y_2)$; by design, $Y_1$ are
missing for all the individuals in the refreshment sample. In this section, we
presume that $X$, $Y_1$ in the panel, and $Y_2$ in the refreshment sample are not
subject to nonresponse, although we relax this when analyzing the APYN.

For each individual $i = 1, \ldots, N$, let $W_i = 1$ if individual $i$ would remain
in wave 2 if included in wave 1, and let $W_i = 0$ if individual $i$ would drop out
of wave 2 if included in wave 1. Here, $W_i$ is an indicator of panel attrition
conditional on participation in wave 1; it is not an indicator of item or unit
nonresponse among individuals in the refreshment sample. We note that
$W_i$ is fully observed for all individuals in the panel but is missing for the
individuals in the refreshment sample, since individuals in the refreshment
sample are not provided the chance to respond in wave 1. Putting it all
together, the concatenated data have the structure illustrated in Table 2.

The AN model requires a joint model for $(Y_1, Y_2 \mid X)$ and a selection
model for $(W \mid X, Y_1, Y_2)$, that is,

$$(Y_1, Y_2) \mid X \sim f(Y_1, Y_2 \mid X, \theta)$$

$$W \mid Y_1, Y_2, X \sim f(W \mid X, Y_1, Y_2, \theta), \tag{3.1}$$

where $\theta$ generically represents the parameters for both models. To enable
identification, (3.1) must exclude interactions between $Y_1$ and $Y_2$.
As an example of an AN model, suppose $Y_1$ and $Y_2$ are binary variables
and $X$ is empty, as in Hirano et al. (1998). One specification of the additive
non-ignorable selection model is

$$Y_{i1} \sim \mathrm{Bern}(\pi_{i1}), \quad \mathrm{logit}(\pi_{i1}) = \alpha_0 \tag{3.2}$$

$$Y_{i2} \mid Y_{i1} \sim \mathrm{Bern}(\pi_{i2}), \quad \mathrm{logit}(\pi_{i2}) = \beta_0 + \beta_1 Y_{i1} \tag{3.3}$$

$$W_i \mid Y_{i1}, Y_{i2} \sim \mathrm{Bern}(\pi_{iW}), \quad \mathrm{logit}(\pi_{iW}) = \tau_0 + \tau_1 Y_{i1} + \tau_2 Y_{i2}. \tag{3.4}$$

For a pattern mixture model representation, we require a model for
$(W \mid X)$ and for $(Y_1, Y_2 \mid X, W)$, that is,

$$W \mid X \sim f(W \mid X, \theta)$$

$$(Y_1, Y_2) \mid X, W \sim f(Y_1, Y_2 \mid X, W, \theta).$$

Using the basic example, one specification of the additive pattern mixture
(APM) model is

$$W_i \sim \mathrm{Bern}(\pi_W), \quad \mathrm{logit}(\pi_W) = \phi_0$$

$$Y_{i1} \mid W_i \sim \mathrm{Bern}(\pi_{i1}), \quad \mathrm{logit}(\pi_{i1}) = \delta_0 + \delta_1 W_i$$

$$Y_{i2} \mid Y_{i1}, W_i \sim \mathrm{Bern}(\pi_{i2}), \quad \mathrm{logit}(\pi_{i2}) = \gamma_0 + \gamma_1 W_i + \gamma_2 Y_{i1}, \tag{3.5}$$

which contains as many free parameters as in (3.2)-(3.4) and thus is
data-identified. To enable identification, we exclude interactions between $Y_1$ and
$W$ in (3.5). We note that both the AN and APM models can include
interactions with $X$ and readily extend to other data types.
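
To make the data layout concrete, the following minimal sketch simulates the binary additive pattern mixture example and then masks values so that the result has the panel and refreshment structure of Table 2. The function name, the parameter names (`phi0`, `delta`, `gamma`) and all numerical values are illustrative assumptions, not quantities estimated in this article.

```python
import numpy as np

def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-x))

def simulate_apm(n_panel, n_refresh, phi0, delta, gamma, seed=0):
    """Simulate the binary APM example and mask it as in Table 2.

    phi0, delta = (d0, d1) and gamma = (g0, g1, g2) are hypothetical
    logit-scale coefficients; missing values are coded as NaN.
    """
    rng = np.random.default_rng(seed)
    n = n_panel + n_refresh
    w = rng.binomial(1, expit(phi0), size=n)                  # W_i
    y1 = rng.binomial(1, expit(delta[0] + delta[1] * w))      # Y_i1 | W_i
    y2 = rng.binomial(1, expit(gamma[0] + gamma[1] * w + gamma[2] * y1))

    in_panel = np.arange(n) < n_panel
    return {
        # Y1 and W are never observed in the refreshment sample.
        "y1": np.where(in_panel, y1, np.nan),
        "w": np.where(in_panel, w, np.nan),
        # Y2 is missing only for panel members who attrite (W = 0).
        "y2": np.where(in_panel & (w == 0), np.nan, y2),
        "in_panel": in_panel,
    }

data = simulate_apm(200, 100, phi0=1.0, delta=(0.0, 0.5), gamma=(-0.5, 1.0, 0.8))
```

In this sketch the complete values are generated for everyone first and then deleted, which mirrors how the attrition indicator is defined for refreshment cases even though it is never observed for them.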

4. Bayesian Latent Pattern Mixture Models. We now develop an
APM model for categorical data with $q = q_0 + q_1 + q_2$ variables. Let
$Z = (X, Y_1, Y_2) = (Z_1, \ldots, Z_q)$ comprise all potentially collected variables. We
order variables so that $j = 1, \ldots, q_0$ for $X$ variables, $j = q_0 + 1, \ldots, q_0 + q_1$
for $Y_1$ variables, and $j = q_0 + q_1 + 1, \ldots, q$ for $Y_2$ variables. For $i = 1, \ldots, N$
and $j = 1, \ldots, q$, without loss of generality let $Z_{ij} \in \{1, \ldots, d_j\}$ denote the
level of variable $j$ for unit $i$, where $d_j \geq 2$ is the total number of levels for
variable $j$.

We specify the pattern mixture model as $f(W) f(Z \mid W)$, including $X$
in the joint distribution of the survey variables. This facilitates
imputation of (ignorable) item nonresponse in $X$, and allows us to take advantage
of computationally efficient latent class representations of categorical data.
Specifically, we adapt the truncated Dirichlet process mixture of products of
multinomial distributions (DPMPM) developed by Dunson and Xing (2009),
used previously for multiple imputation of missing cross-sectional data by Si
and Reiter (2013). The DPMPM assumes that each individual is a member
of a latent class, and that within each class the variables follow independent
multinomial distributions. Averaging the multinomial probabilities over the
latent classes induces global dependence among the variables.
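
A small numerical illustration may help show why averaging over classes induces dependence. The sketch below uses two hypothetical latent classes and two binary variables that are independent within each class; all probabilities are made-up values chosen only for illustration.

```python
import numpy as np

# Hypothetical two-class example; none of these values come from the article.
pi = np.array([0.5, 0.5])    # class weights Pr(s = h)
p_a = np.array([0.9, 0.1])   # Pr(A = 1 | s = h)
p_b = np.array([0.8, 0.2])   # Pr(B = 1 | s = h)

# Within a class A and B are independent, so Pr(A = 1, B = 1 | s = h) is a product.
joint_11 = float(np.sum(pi * p_a * p_b))

# Marginally, each probability is the class-weighted average.
marg_a = float(np.sum(pi * p_a))
marg_b = float(np.sum(pi * p_b))
product_of_marginals = marg_a * marg_b

# If A and B were marginally independent these two numbers would coincide.
print(joint_11, product_of_marginals)
```

Here the joint probability is 0.37 while the product of the marginals is 0.25, so the two variables are dependent once the class label is integrated out.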

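For $i = 1, \ldots, N$, let $s_i \in \{1, \ldots, K\}$ indicate the class of individual $i$,
and let $\pi_h = \Pr(s_i = h)$ where $h = 1, \ldots, K$. We assume that
$\pi = (\pi_1, \ldots, \pi_K)$ is the same for all individuals. For $j = q_0 + 1, \ldots, q_0 + q_1$, let
$\psi_{hjz} = \Pr(Z_{ij} = z \mid s_i = h)$ be the probability of $Z_{ij} = z$ for any value $z$ given
that individual $i$ is in class $h$. For $j = 1, \ldots, q_0$ and $j = q_0 + q_1 + 1, \ldots, q$, let
$\psi^{(1)}_{hjz} = \Pr(Z_{ij} = z \mid W_i = 1, s_i = h)$ and $\psi^{(0)}_{hjz} = \Pr(Z_{ij} = z \mid W_i = 0, s_i = h)$
be the probabilities of $Z_{ij} = z$ for any value $z$ given that individual $i$ is in
class $h$ for each value of $W_i$. The complete-data likelihood for $(s_i, W_i, Z_i)$ in
the BLPM is as follows.

$$s_i \mid \pi \sim \mathrm{Multinomial}(\pi_1, \ldots, \pi_K) \tag{4.1}$$

$$W_i \mid s_i \sim \mathrm{Bernoulli}(\rho_{s_i}). \tag{4.2}$$

When $j \in \{q_0 + 1, \ldots, q_0 + q_1\}$, we have

$$Z_{ij} \mid s_i \sim \mathrm{Multinomial}(\{1, \ldots, d_j\}, \psi_{s_i j 1}, \ldots, \psi_{s_i j d_j}). \tag{4.3}$$

When $j \in \{1, \ldots, q_0, q_0 + q_1 + 1, \ldots, q\}$, we have

$$Z_{ij} \mid s_i, W_i = 1 \sim \mathrm{Multinomial}(\{1, \ldots, d_j\}, \psi^{(1)}_{s_i j 1}, \ldots, \psi^{(1)}_{s_i j d_j}) \tag{4.4}$$

$$Z_{ij} \mid s_i, W_i = 0 \sim \mathrm{Multinomial}(\{1, \ldots, d_j\}, \psi^{(0)}_{s_i j 1}, \ldots, \psi^{(0)}_{s_i j d_j}). \tag{4.5}$$

The BLPM model is a mixture of pattern mixture models, where

$$f(Z_i, W_i) = \sum_{h=1}^{K} \Pr(s_i = h) \, f(W_i \mid s_i = h) \, f(Z_i \mid W_i, s_i = h).$$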

As in the DPMPM, we assume that $(Z_{q_0+1}, \ldots, Z_{q_0+q_1})$, that is, $Y_1$,
follow independent, class-specific multinomial distributions that are also
independent of $W$ (and $X, Y_2$). However, we depart from the DPMPM by
letting $(Z_1, \ldots, Z_{q_0}, Z_{q_0+q_1+1}, \ldots, Z_q)$ follow class-specific, independent
multinomial distributions that depend on $W$. Relaxing the conditional
independence between $Y_2$ and $W$ (that is, the assumption that $Y_2$ is
independent of $W$ within any latent class) is possible because of information
offered by the refreshment sample. We force $Y_1$ and $W$ to be independent
within latent classes to enable identification, following the strategy outlined
in Section 3. We allow $X$ to depend on $W$ within classes to offer additional
flexibility for settings where the distributions of $X$ are substantially
different across attriters and non-attriters. When this is not the case (the
distributions of $X$ are observed for both $W = 1$ and $W = 0$), one can specify
the model so that $X$ does not depend on $W$ within classes, thereby reducing
the number of parameters to estimate.
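
The generative structure in (4.1)-(4.5) can be sketched in a few lines for the special case of binary items and no $X$ variables. The function name and every parameter value below are hypothetical; the arrays hold $\Pr(\text{item} = 1)$ within each class, and, for the wave 2 items, within each class and attrition status.

```python
import numpy as np

def simulate_blpm(n, pi, rho, psi_y1, psi_y2_w0, psi_y2_w1, seed=0):
    """Draw (s, W, Y1, Y2) for binary items under a BLPM-style model.

    pi: (K,) class weights; rho: (K,) Pr(W = 1 | class).
    psi_y1: (K, q1) Pr(Y1 item = 1 | class), the same for W = 0 and W = 1.
    psi_y2_w0, psi_y2_w1: (K, q2) Pr(Y2 item = 1 | class, W).
    """
    rng = np.random.default_rng(seed)
    s = rng.choice(len(pi), size=n, p=pi)          # class labels, as in (4.1)
    w = rng.binomial(1, rho[s])                    # attrition indicator, as in (4.2)
    y1 = rng.binomial(1, psi_y1[s])                # wave 1 items, as in (4.3)
    # Wave 2 probabilities switch with W within each class, as in (4.4)-(4.5).
    p_y2 = np.where(w[:, None] == 1, psi_y2_w1[s], psi_y2_w0[s])
    y2 = rng.binomial(1, p_y2)
    return s, w, y1, y2

# Hypothetical two-class, two-items-per-wave configuration.
pi = np.array([0.6, 0.4])
rho = np.array([0.9, 0.5])
psi_y1 = np.array([[0.2, 0.3], [0.7, 0.8]])
psi_y2_w0 = np.array([[0.6, 0.5], [0.3, 0.2]])
psi_y2_w1 = np.array([[0.2, 0.3], [0.7, 0.8]])

s, w, y1, y2 = simulate_blpm(20000, pi, rho, psi_y1, psi_y2_w0, psi_y2_w1)
```

Setting `psi_y2_w0` equal to `psi_y2_w1` in this sketch recovers the conditional independence structure of the usual DPMPM.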

For the prior distribution on $\pi$, we use the stick-breaking representation
of a Dirichlet process prior distribution (Sethuraman, 1994), truncating at
large $K$ for computational convenience. In particular, we have

$$\pi_h = V_h \prod_{g<h} (1 - V_g) \tag{4.6}$$

$$V_h \sim \mathrm{Beta}(1, \alpha), \text{ for } h = 1, \ldots, K - 1, \text{ and } V_K = 1 \tag{4.7}$$

$$\alpha \sim \mathrm{Gamma}(a_\alpha, b_\alpha). \tag{4.8}$$

We use uniform prior distributions on all $\psi$ and $\rho$ parameters. We follow
Dunson and Xing (2009) and Si and Reiter (2013) and set $a_\alpha = b_\alpha = 0.25$.
Setting $a_\alpha + b_\alpha = 0.5$ represents a small prior sample size and hence vague
specification, thereby allowing the data to dominate the cluster allocations.
In our simulations and the APYN analyses, results are not sensitive to
reasonable default choices of $(a_\alpha, b_\alpha)$. We estimate the model using a blocked
Gibbs sampler (Ishwaran and James, 2001); see the online supplement for
an outline of the algorithm.
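
As a rough illustration of the truncated stick-breaking construction in (4.6)-(4.7), the sketch below draws one vector of class weights for a fixed concentration parameter. The function name is hypothetical, and fixing $\alpha = 1$ rather than drawing it from its Gamma prior is a simplifying assumption made only for this example.

```python
import numpy as np

def truncated_stick_breaking(alpha, K, rng):
    """Draw class weights pi_1, ..., pi_K from a stick-breaking prior truncated at K."""
    v = rng.beta(1.0, alpha, size=K)
    v[-1] = 1.0  # V_K = 1 forces the weights to sum to one
    # remaining[h] is the stick length left before break h: prod_{g<h} (1 - V_g)
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    return v * remaining

rng = np.random.default_rng(1)
pi = truncated_stick_breaking(alpha=1.0, K=20, rng=rng)
```

Smaller values of `alpha` tend to place most of the mass on the first few classes, which is one way to see why the number of occupied classes is typically much smaller than the truncation level.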

We set $K$ to be large enough to help the DPMPM to describe the joint
distribution reasonably well yet still offer fast computation. Using an initial
proposal for $K$, say $K = 20$, analysts can examine the posterior distributions
of the number of classes with at least one assigned observation across Markov
chain Monte Carlo (MCMC) iterates to diagnose if $K$ is large enough. When
there is significant posterior mass at a number of classes equal to $K$, the
analyst should add more classes. The analyst can repeat this diagnostic
procedure until finding a suitable $K$. We note that the posterior predictive
distributions used to generate imputations typically are very similar for any
sufficiently large $K$.
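
The diagnostic described above can be sketched as follows, assuming the sampled class labels have been saved as an integer array with one row per retained MCMC iteration. The function names and the storage layout are assumptions for illustration; the article's own implementation is outlined in its online supplement.

```python
import numpy as np

def occupied_class_counts(class_draws):
    """Number of classes with at least one assigned unit, per saved iteration.

    class_draws: (T, N) integer array of sampled labels s_i.
    """
    return np.array([np.unique(row).size for row in class_draws])

def share_at_truncation(class_draws, K):
    """Fraction of saved iterations in which all K classes are occupied."""
    return float(np.mean(occupied_class_counts(class_draws) == K))

# Toy example with T = 3 iterations, N = 3 units and K = 3 classes.
draws = np.array([[0, 0, 1],
                  [0, 1, 2],
                  [2, 2, 2]])
counts = occupied_class_counts(draws)
share = share_at_truncation(draws, K=3)
```

A share that is not close to zero would suggest rerunning the sampler with a larger truncation level; what counts as "significant" mass is left to the analyst.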

The usual truncated DPMPM model is based on (4.1)-(4.8) but requires
that $\psi^{(1)}_{hjc_j} = \psi^{(0)}_{hjc_j}$ in (4.4) and (4.5) for all $(h, j, c_j)$. This implies that all
$Z$ are independent of $W$ within classes, which may not be the case. The
refreshment sample offers information that allows us to relax this assumption,
particularly for $Y_2$. Intuitively speaking, the refreshment sample offers
information about $f(Y_2 \mid s)$, and the complete cases in the panel offer information
about $f(Y_2 \mid s, W = 1)$. These two distributions identify $f(Y_2 \mid s, W = 0)$.
Without the refreshment sample, we do not have information to differentiate
$f(Y_2 \mid s, W = 0)$ and $f(Y_2 \mid s, W = 1)$; as a consequence, we are forced to
make the unverifiable assumption of conditional independence between $Y_2$
and $W$. In Section 5, we present simulation studies that illustrate the
biases that can result from falsely assuming conditional independence.
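
The intuition can be written out with the law of total probability. The display below is a sketch that treats the class label $s$ and $\rho_s = \Pr(W = 1 \mid s)$ as known, which is a simplification; in the model they are estimated jointly with everything else.

```latex
\begin{align*}
f(Y_2 \mid s) &= \rho_s \, f(Y_2 \mid s, W = 1) + (1 - \rho_s) \, f(Y_2 \mid s, W = 0), \\
f(Y_2 \mid s, W = 0) &= \frac{f(Y_2 \mid s) - \rho_s \, f(Y_2 \mid s, W = 1)}{1 - \rho_s},
\qquad \rho_s < 1 .
\end{align*}
```

As an illustrative calculation, suppose $\rho_s = 0.80$, $\Pr(Y_2 = 1 \mid s, W = 1) = 0.38$ and $\Pr(Y_2 = 1 \mid s) = 0.456$. Then $\Pr(Y_2 = 1 \mid s, W = 0) = (0.456 - 0.80 \times 0.38)/0.20 = 0.76$.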

The model can be used for posterior inference or for multiple imputation.
For the latter, analysts select $m$ of the sampled completed datasets after
convergence of the Gibbs sampler. These datasets should be spaced
sufficiently so as to be approximately independent. This involves thinning the
MCMC samples so that the autocorrelations among parameters are close
to zero. Multiple imputation inferences then can be based on all $N$ units
in the concatenated data. Alternatively, as discussed in Deng et al. (2013),
some statistical agencies or data analysts may prefer to disseminate or base
inferences on only the original panel after using the refreshment sample for
imputing the missing values due to attrition. This might be preferable when
combining the original and freshened samples complicates interpretation of
sampling weights and design-based inference. Additionally, using only the
$N_p$ completed panel cases reduces sensitivity of inferences to the
specification of the multiple imputation model, which enters the analysis only for
completing $Y_2$ for the attriters. As pointed out by reviewers of this
article, survey-weighted analyses of the multiply imputed data can result in
biased estimates of variance (Kim et al., 2006). This can result from lack
of congeniality (Meng, 1994a) of the imputation model and survey-weighted
analysis.

5. Simulation Studies. In this section, we present results of
simulation studies that illustrate the potential of the BLPM model to account
for non-ignorable attrition. We use two data generation mechanisms: one in
which $Y_2$ and $W$ are not independent within classes, and one in which they
are independent within classes. We compare the performance of the BLPM
model to the usual DPMPM, a model that assumes $Y_2$ and $W$ are
conditionally independent. In each scenario, we set $N_p = 2{,}000$ and $N_r = 1{,}000$. Each
wave includes $q_1 = q_2 = 5$ binary variables; for simplicity, we do not include
any $X$ variables. Table 3 displays the values of $\pi$ and the $\psi$ parameters
for each scenario. These designs result in non-trivial dependence structures;
for example, we ran Pearson’s chi-square tests in the true datasets and
rejected independence at the 0.05 significance level for 29 out of the 45 paired
combinations among the 10 variables.

In each replication of the simulation, we generate a dataset with values
of $(Z, W)$ for all $N = 3{,}000$ records; we call this the true data. We delete
the values of $Y_2$ for all records in the panel with $W_i = 0$ and the values of
$(Y_1, W)$ for all records in the refreshment sample. The resulting dataset
has the structure in Table 2 without $X$. We fit the BLPM and DPMPM
models using the Gibbs sampler, imputing $Y_2$ in the panel when $W = 0$
and $(Y_1, W)$ in the refreshment sample in each MCMC iteration. For each
scenario, we run 100 independent replications of the simulation.

Table 3

Latent class and marginal probabilities for simulations. The first five psi parameters
correspond to Y_1j variables, and the last five psi parameters correspond to Y_2j variables.
The columns labeled “Marginal” are the weighted averages of psi over the latent classes.

Y_2 and W not conditionally independent
Parameter                               h=1          h=2          h=3          Marginal
pi_h                                    0.4          0.3          0.3          -
rho_h                                   0.80         0.95         0.60         0.78
psi_{h,1,1}                             0.25         0.55         0.85         0.52
psi_{h,2,1}                             0.20         0.50         0.80         0.47
psi_{h,3,1}                             0.15         0.45         0.75         0.42
psi_{h,4,1}                             0.10         0.40         0.70         0.37
psi_{h,5,1}                             0.05         0.35         0.65         0.32
psi^(0)_{h,6,1}, psi^(1)_{h,6,1}        0.76, 0.38   0.46, 0.58   0.16, 0.78   0.49, 0.56
psi^(0)_{h,7,1}, psi^(1)_{h,7,1}        0.77, 0.41   0.47, 0.61   0.17, 0.81   0.50, 0.59
psi^(0)_{h,8,1}, psi^(1)_{h,8,1}        0.78, 0.44   0.48, 0.64   0.18, 0.84   0.51, 0.62
psi^(0)_{h,9,1}, psi^(1)_{h,9,1}        0.79, 0.47   0.49, 0.67   0.19, 0.87   0.52, 0.65
psi^(0)_{h,10,1}, psi^(1)_{h,10,1}      0.80, 0.50   0.50, 0.70   0.20, 0.90   0.53, 0.68

Y_2 and W conditionally independent
Parameter                               h=1          h=2          h=3          Marginal
pi_h                                    0.4          0.3          0.3          -
rho_h                                   0.80         0.95         0.60         0.78
psi_{h,1,1}                             0.25         0.55         0.85         0.52
psi_{h,2,1}                             0.20         0.50         0.80         0.47
psi_{h,3,1}                             0.15         0.45         0.75         0.42
psi_{h,4,1}                             0.10         0.40         0.70         0.37
psi_{h,5,1}                             0.05         0.35         0.65         0.32
psi^(0)_{h,6,1}, psi^(1)_{h,6,1}        0.38, 0.38   0.58, 0.58   0.78, 0.78   0.56
psi^(0)_{h,7,1}, psi^(1)_{h,7,1}        0.41, 0.41   0.61, 0.61   0.81, 0.81   0.59
psi^(0)_{h,8,1}, psi^(1)_{h,8,1}        0.44, 0.44   0.64, 0.64   0.84, 0.84   0.62
psi^(0)_{h,9,1}, psi^(1)_{h,9,1}        0.47, 0.47   0.67, 0.67   0.87, 0.87   0.65
psi^(0)_{h,10,1}, psi^(1)_{h,10,1}      0.50, 0.50   0.70, 0.70   0.90, 0.90   0.68

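To evaluate the potential of the BLPM and DPMPM models to correct
for attrition, as well as to compare them with each other, we focus primarily
on the completed data estimates of $\Pr(Y_{2j} = 1)$ in the panel. Let superscript
$r = 1, \ldots, 100$ index replications of the simulation, and let superscript
$t = 1, \ldots, T$ index MCMC iterations, where $T$ is the number of MCMC iterations
used in computation. For all $(r, t)$, and for $i = 1, \ldots, N$ and $j = 1, \ldots, q$,
let $z_{ij}^{(r,t)}$ be the value of $Z_{ij}$ in replication $r$ and MCMC iteration $t$. Here, if
$j > q_1$, $z_{ij}^{(r,t)}$ is an observed value for all panel cases with $W_i^{(r)} = 1$ and is
an imputed value for all cases with $W_i^{(r)} = 0$. For any variable indexed by
$j > q_1$, we compute

$$\bar{z}_j^{(r,t)} = \sum_{i=1}^{N_p} I(z_{ij}^{(r,t)} = 1) / N_p, \qquad \bar{z}_j^{(r)} = \mathrm{Median}(\bar{z}_j^{(r,1)}, \ldots, \bar{z}_j^{(r,T)}).$$

Let $\bar{z}_j^{(r),\mathrm{true}}$ be the value of $\sum_{i=1}^{N_p} I(z_{ij} = 1)/N_p$ for the panel in the true data
associated with replication $r$. We then compute

$$\mathrm{DIF}_j = \left| \sum_{r=1}^{100} \bar{z}_j^{(r)} / 100 - \sum_{r=1}^{100} \bar{z}_j^{(r),\mathrm{true}} / 100 \right|$$

$$\mathrm{RMSE}_j = \sqrt{ \sum_{r=1}^{100} \left( \bar{z}_j^{(r)} - \bar{z}_j^{(r),\mathrm{true}} \right)^2 / 100 }.$$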
Fig 1. Simulation results when the data are generated with $Y_2$ and $W$ dependent within
class. Results for DPMPM displayed with triangles and for BLPM with circles.
Horizontal axis: W, Y11, Y12, Y13, Y14, Y15, Y21, Y22, Y23, Y24, Y25.


The larger $\mathrm{DIF}_j$ and $\mathrm{RMSE}_j$ are, the less accurate the completed-data
estimates in the panel. We use only the panel and not the concatenated data
to magnify the impact of the models on imputation of the missing data
due to non-ignorable attrition. We also report values of $\mathrm{DIF}_j$ and $\mathrm{RMSE}_j$
for the BLPM and DPMPM models for the means of $W$ and $Y_1$ in the
refreshment sample. These are both fully imputed in the models.
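
The two summaries can be computed with a short helper. This sketch assumes that $\mathrm{DIF}_j$ is the absolute difference between the across-replication averages of the completed-data and true panel proportions, and that $\mathrm{RMSE}_j$ is the root mean squared difference across replications; the function name and array layout are hypothetical.

```python
import numpy as np

def dif_and_rmse(completed_means, true_means):
    """Summarize accuracy of completed-data estimates for one variable j.

    completed_means: (R, T) array; entry (r, t) is the panel proportion with
        z_ij = 1 in replication r at saved MCMC iteration t.
    true_means: (R,) array of the corresponding proportions in the true data.
    """
    completed_means = np.asarray(completed_means, dtype=float)
    true_means = np.asarray(true_means, dtype=float)
    est = np.median(completed_means, axis=1)   # posterior median per replication
    dif = abs(est.mean() - true_means.mean())
    rmse = np.sqrt(np.mean((est - true_means) ** 2))
    return float(dif), float(rmse)

# Toy input with R = 2 replications and T = 3 saved iterations.
dif, rmse = dif_and_rmse([[0.5, 0.6, 0.7], [0.4, 0.4, 0.5]], [0.5, 0.5])
```

In the toy input the two replications err in opposite directions, so the difference in averages is zero while the root mean squared error is 0.1, which shows why both summaries are worth reporting.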

For each simulation run, we run MCMC chains for both models with
$K = 10$ classes; we obtained very similar results with $K = 20$ and $K = 30$. We
run the chains for 20,000 and 30,000 iterations for the BLPM and DPMPM
models, respectively, which exploratory runs suggest is sufficient for the
chains to converge. We keep every tenth draw from the final 10,000 draws
of each chain, leaving $T = 1{,}000$ MCMC draws for inference. To initialize
the chains, for all $h$ we set $\rho_h = N_{cp}/N_p$; set all $\psi$ parameters equal to 0.5;
set $\alpha = 1$; and generate all $K - 1$ initial values of $V_h$ from (4.7) using $\alpha = 1$.
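
The thinning rule above can be expressed as a small indexing helper. The function name is hypothetical, and the choice to keep the last draw of each block of ten (rather than the first) is an assumption; either convention yields the same number of retained draws.

```python
def kept_iterations(n_iter, n_final, thin):
    """Zero-based indices of retained draws: every `thin`-th of the final `n_final`."""
    start = n_iter - n_final
    return list(range(start + thin - 1, n_iter, thin))

blpm_kept = kept_iterations(20000, 10000, 10)    # BLPM chain length
dpmpm_kept = kept_iterations(30000, 10000, 10)   # DPMPM chain length
```

Both chains therefore contribute the same number of retained draws even though the DPMPM chain is longer.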

Figure 1 summarizes the values of $\mathrm{DIF}_j$ and $\mathrm{RMSE}_j$ for each quantity
for both the BLPM and DPMPM models for the simulation with conditional
dependence between $Y_2$ and $W$ within classes. We also computed the $\mathrm{DIF}_j$
and $\mathrm{RMSE}_j$ when estimating each $\Pr(Y_{2j} = 1)$ in the panel with only the
complete panel cases. For this complete-cases estimator, the average values
of DIF and RMSE across the 100 runs are shown in Table 4.

Table 4
Simulation results for the complete-cases estimator when the data are
generated with $Y_2$ and W dependent within class.

  $\Pr(Y_{2j} = 1)$   j = 1   j = 2   j = 3   j = 4   j = 5
  DIF                 0.031   0.033   0.039   0.042   0.046
  RMSE                0.031   0.033   0.039   0.043   0.047

Compared to the results in Table 4, the BLPM and DPMPM tend to
offer smaller differences in point estimates, correcting the bias in
complete-case analysis due to attrition. When estimating $\Pr(Y_{2j} = 1)$
using the panel data alone, the BLPM tends to be more accurate than the
DPMPM. The relative performance of the DPMPM worsens as the magnitude of
the attrition bias increases, where by attrition bias we mean the difference
in the marginal probabilities of $Y_{2j}$ for non-attriters and attriters,
that is, $\Pr(Y_{2j} = 1 \mid W = 1) - \Pr(Y_{2j} = 1 \mid W = 0)$. We also
tend to see better performance from the BLPM when predicting the missing W
and $Y_1$ in the refreshment sample, although the gaps are not as noticeable
as those for $Y_2$. For all $j > q_1$, the simulated matched pair standard
errors are around 0.003 for comparing $DIF_j$ for BLPM and DPMPM, and around
0.005 for comparing $DIF_j$ for BLPM and the complete-case estimator.
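
As a rough illustration of how a matched pair standard error of this kind can be computed, the sketch below takes per-run accuracy values for two methods applied to the same simulated datasets and returns the standard error of their mean paired difference. The function name and the per-run values are hypothetical, and we assume the usual paired-difference formula; the exact computation used in the simulation study is not restated here.

```python
import math


def matched_pair_se(a, b):
    """Standard error of the mean paired difference a[r] - b[r] across runs."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length sequences with at least 2 runs")
    d = [x - y for x, y in zip(a, b)]
    n = len(d)
    mean_d = sum(d) / n
    # Sample variance of the paired differences (n - 1 denominator).
    var_d = sum((v - mean_d) ** 2 for v in d) / (n - 1)
    return math.sqrt(var_d / n)


# Hypothetical absolute errors from three simulation runs for two methods.
method_a = [0.010, 0.020, 0.030]
method_b = [0.015, 0.030, 0.045]
se = matched_pair_se(method_a, method_b)
```

Pairing by run removes the run-to-run variation shared by both methods, which is why such standard errors can be much smaller than those of either method alone.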

Figure 2 summarizes the values of $DIF_j$ and $RMSE_j$ for each quantity
for both the BLPM and DPMPM models for the simulation with conditional
independence between $Y_2$ and W within classes. For the complete-cases
estimator, across the 100 runs, the average values of
$(DIF_1, \ldots, DIF_5)$ all equal approximately 0.016, with associated
$(RMSE_1, \ldots, RMSE_5)$ equal to approximately 0.017. Once again, the
BLPM and DPMPM tend to estimate each $\Pr(Y_{2j} = 1)$ using the panel data
alone more accurately than the complete-case analysis. When estimating
$\Pr(Y_{2j} = 1)$ using the panel data alone, the DPMPM tends to be slightly
more accurate than the BLPM, but the differences are modest when compared to
those in Figure 1. The differences stem from estimating additional
parameters in the BLPM, whereas the DPMPM is correctly specified for these
data. For all $j > q_1$, the simulated matched pair standard errors are
around 0.002 when comparing $DIF_j$ for BLPM and DPMPM, and 0.002 when
comparing $DIF_j$ for BLPM and the complete-case estimator.

In summary, these simulation results suggest that both the BLPM and 
DPMPM can reduce attrition bias compared to using the complete cases. 
The BLPM is more flexible than the DPMPM in that it can protect against 
failure of the conditional independence assumption for $Y_2$ and W. However,
when conditional independence holds, the BLPM estimates can be similar to 
those based on the DPMPM. A sensible default position with decent sample 
sizes is to use the BLPM, since the data do not inform whether conditional 
independence is appropriate. 

In our experience, in modest sample sizes both the BLPM and the DPMPM
can suffer, as the latent class models will sacrifice higher-order
relationships among the variables. Thus, it is crucial to check the fit of
the models. We suggest methods for doing so in the analysis of the APYN data
(Section 6).

Fig 2. Simulation results when the data are generated with $Y_2$ and W
independent within class. Results for DPMPM displayed with triangles and for
BLPM with circles.

6. Using the BLPM to Correct for Attrition in the APYN data. 

We now apply the BLPM model to account for attrition in the APYN data. 
To begin, we first provide some additional context on the survey design 
that is relevant for our imputations and analyses. Throughout, we refer 
to cross-sectional unit nonresponse as non-participation or refusal in the 
wave when an individual is initially surveyed; attrition happens when an 
individual drops out after participating in a previous wave. For example, 
the refreshment sample is subject to cross-sectional unit nonresponse but 
not attrition, as these individuals are only surveyed at wave 2. 

6.1. Survey weights in the APYN. The APYN data file includes survey
weights at each wave. The wave 1 weights are the product of design-based
weights and post-stratification adjustments for cross-sectional unit
nonresponse at wave 1. These post-stratification adjustments assume the
cross-sectional unit nonresponse is missing at random, as is common in the
literature (e.g., Hirano et al., 1998; Bhattacharya, 2008; Das, Toepoel and
van Soest, 2011). The wave 2 weights for the 1724 panel participants include 
post-stratification adjustments for attrition in the panel, for cross-sectional 
unit nonresponse at wave 1, and for cross-sectional unit nonresponse among 
cases in the refreshment sample; the way that weights are reported does not 
allow us to disentangle these adjustments. Since we use the BLPM model to 
account for non-ignorable attrition, we disregard the wave 2 weights in all 
analyses. 
The original panel is approximately an equal probability sample, with
deviations due primarily to (i) slight oversampling of African American and 
Hispanic telephone exchanges and (ii) undersampling of areas where the 
MSN TV service network cannot be used and where there is no access to the 
internet. The post-strata in wave 1 are based on gender, race, the age groups 
in Table 1, the education groups in Table 1, census region, metropolitan 
area, and household internet access. We include most of these variables in 
the BLPM model, thereby accounting for important aspects of the design 
when making imputations. The geographic variables and internet access are 
not strong predictors of Obama favorability given all the other variables in 
Table 1. In a logistic regression with Obama favorability in wave 1 as the 
dependent variable, a drop in deviance test for the models with and without 
census region, metropolitan area, and internet access (including all other 
variables in X) results in a p-value of 0.20.^ Since these variables do not 
substantially improve our ability to predict the missing Obama favorability 
values, and are not of substantive interest in our analyses of the American 
electorate, we exclude them from the imputation model. 

We use unweighted analyses to illustrate the attrition effects and describe 
the behavior of the BLPM model (as in Figures 3 and 4 in Section 6.3), and 
we use survey weighted analyses when computing finite population quantities 
(as in Figure 5 in Section 6.3). The survey-weighted estimates account for 
the sampling design and cross-sectional unit nonresponse in wave 1 only. To 
make these estimates, we use the wave 1 weights for the 1724 panelists in 
multiple imputation inferences (Rubin, 1987). 

6.2. Generating Completed Datasets. We run the BLPM with K = 30
classes using the Gibbs sampler outlined in the online supplement, treating
Obama favorability as $(Y_1, Y_2)$ and all other variables as X. As initial
values for W in the refreshment sample, we use independent draws from
a Bernoulli distribution with probability $N_{cp}/N_p = 0.63$. For missing
data in $(X, Y_1, Y_2)$, due to item nonresponse and attrition, and W in the
refreshment sample, we implement the initialization steps of the MCMC as
follows.

• For any missing values in X, sample from the marginal distribution of
X computed from the observed cases in the combined panel and the
refreshment sample.

• For any missing values in $Y_1$, sample from the observed marginal
distribution of $Y_1$.

• For missing values in $Y_2$ in the refreshment sample, sample from the
observed marginal distribution of $Y_2$ in the refreshment sample.

• For missing values in $Y_2$ in the panel for cases with $W_i = 1$, sample
from the observed marginal distribution of $Y_2$ in the panel.

• For missing values in $Y_2$ in the panel for cases with $W_i = 0$, sample
from independent Bernoulli distributions with probabilities
$\Pr(Y_2 \mid W = 0)$, obtained by
$[\Pr(Y_2) - \Pr(Y_2 \mid W = 1)\Pr(W = 1)]/\Pr(W = 0)$. Here, $\Pr(Y_2)$ is
estimated with the refreshment sample, $\Pr(Y_2 \mid W = 1)$ is estimated
with cases with $W_i = 1$ in the panel, and $\Pr(W = 1) = 0.63$.

^We estimated the model with wave 1 data to avoid any issues from
non-ignorable attrition.
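
The last initialization step can be checked with a short calculation. The sketch below plugs in the unweighted available-case proportions from Table 5 (0.617 in the refreshment sample, 0.549 among wave 2 panel respondents) together with Pr(W = 1) = 0.63; these inputs are used only for illustration and may differ slightly from the values used in the actual initialization. The function name is hypothetical, and clipping the result to [0, 1] is a safeguard we add, not something described in the text.

```python
import random


def attriter_probability(p_y2, p_y2_given_stay, p_stay):
    """Solve Pr(Y2) = Pr(Y2|W=1)Pr(W=1) + Pr(Y2|W=0)Pr(W=0) for Pr(Y2|W=0)."""
    p_drop = 1.0 - p_stay
    value = (p_y2 - p_y2_given_stay * p_stay) / p_drop
    # Added safeguard: sampling noise could push the solution outside [0, 1].
    return min(1.0, max(0.0, value))


# Illustrative inputs taken from Table 5 and the stated Pr(W = 1) = 0.63.
p = attriter_probability(p_y2=0.617, p_y2_given_stay=0.549, p_stay=0.63)

# Independent Bernoulli starting values for the 1,011 attriters' wave 2 outcome.
rng = random.Random(2008)
initial_y2 = [1 if rng.random() < p else 0 for _ in range(1011)]
```

With these inputs the implied starting probability for attriters is roughly 0.73, well above the 0.549 observed among non-attriters, which is the direction of attrition bias discussed in Section 6.3.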

For the initial values of the parameters, we set $\alpha = 1$; set each
$\rho_h = N_{cp}/N_p$; set each multinomial probability parameter equal to
the corresponding marginal probability calculated from the initial completed
dataset; and set $V_h = 0.1$ for h = 1, ..., K-1. Each record's latent class
indicator is initialized from a draw of a multinomial distribution with
probability $\pi$ implied by the set of initial $V_h$.
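
To make the final initialization step concrete, the sketch below converts K-1 stick-breaking fractions into K class probabilities and draws one initial class label. It assumes the usual truncated stick-breaking construction, in which class h receives fraction V_h of the probability not yet assigned and the last class receives the remainder; that construction is not restated in this section, and the function name is hypothetical.

```python
import random


def stick_breaking_weights(v):
    """Map K-1 fractions in (0, 1) to K class probabilities that sum to one."""
    weights = []
    remaining = 1.0  # probability mass not yet assigned to a class
    for v_h in v:
        weights.append(v_h * remaining)
        remaining *= 1.0 - v_h
    weights.append(remaining)  # the last class takes whatever is left
    return weights


K = 30
pi = stick_breaking_weights([0.1] * (K - 1))

# Draw one record's initial latent class indicator (0-based label).
rng = random.Random(1)
z_init = rng.choices(range(K), weights=pi, k=1)[0]
```

Starting every fraction at 0.1 spreads the initial probability over many classes (the first class gets 0.1, the second 0.09, and so on), so no class is effectively ruled out at the start of the chain.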

We run the MCMC for 150,000 iterations, treating the first 100,000 as 
burn-in and thinning every 50th iteration. The trace plots of each variable’s 
marginal probability suggest convergence. The posterior mode of the number 
of distinct occupied classes is 9, and the maximum is 18. This suggests that 
K = 30 classes is sufficient. We collect m = 50 completed datasets by keeping 
every twentieth draw from the T = 1000 thinned draws. We use only the Np 
records in the completed panels for multiple imputation inferences. 
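
The bookkeeping behind these counts can be sketched as follows. Which draw is kept within each thinning block (here, the last one) is our assumption for illustration; the text specifies only the thinning intervals.

```python
total_iters = 150_000
burn_in = 100_000
thin = 50

# Iteration numbers (1-based) retained after burn-in and thinning.
kept = list(range(burn_in + thin, total_iters + 1, thin))

# Every twentieth retained draw supplies one completed dataset.
completed = kept[19::20]
```

Here `len(kept)` equals T = 1000 and `len(completed)` equals m = 50, matching the counts stated above.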

6.3. Results. We begin by comparing the distributions of variables in
wave 2 among the $N_{cp}$ non-attriters in the panel and the $N_r$
respondents in the refreshment sample; these are summarized in Table 5.
Among the non-attriters, 54.9% favor Obama. In the refreshment sample,
however, 61.7% favor Obama. This suggests that people who liked Obama may
have dropped out with higher frequency than those who did not. To give a
sense of the magnitude of these differences, the 95% confidence intervals
corresponding to these two percentages are (0.525, 0.573) and
(0.572, 0.662), offering evidence that the difference may well be
systematic. Of note, compared to the refreshment sample, the $N_{cp}$
non-attriters are less likely to be Democrats and to be liberals, more
likely to be non-white and to have income below \$30,000, and more likely
to be below age 45.
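
One informal way to gauge the gap between the two favorability estimates is to back out their standard errors from the reported intervals and form a z statistic for the difference. The sketch below does this under two assumptions that the text does not state: the intervals are symmetric normal-theory (Wald) intervals, and the two samples are independent. The helper name is hypothetical.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Standard error implied by a symmetric 95% normal-theory interval."""
    return (upper - lower) / (2 * z)


se_panel = se_from_ci(0.525, 0.573)  # non-attriters in the panel
se_ref = se_from_ci(0.572, 0.662)    # refreshment sample

diff = 0.617 - 0.549
se_diff = math.sqrt(se_panel ** 2 + se_ref ** 2)
z_stat = diff / se_diff
```

Under these assumptions the statistic is about 2.6, consistent with the statement that the difference may well be systematic even though the two intervals nearly touch.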

Table 5
Unweighted proportions of respondents in each category in wave 1 and wave 2
of the panel (W1 and W2), and in the refreshment sample (Ref). Proportions
based on available cases only, before imputation of item nonresponse.

  Variable              W1      W2      Ref.
  Favorable to Obama    0.553   0.549   0.617
  Democrat              0.327   0.318   0.374
  Independent           0.369   0.374   0.312
  Liberal               0.223   0.234   0.289
  Conservative          0.366   0.370   0.397
  Age 18-29             0.148   0.135   0.110
  Age 30-44             0.284   0.284   0.213
  Age 45-59             0.317   0.320   0.341
  HS Edu. or less       0.343   0.325   0.323
  College Edu.          0.298   0.333   0.308
  Non-white             0.230   0.220   0.177
  Female                0.548   0.537   0.565
  Income < 30K          0.277   0.262   0.170
  Income 30-50K         0.269   0.270   0.306
  Income 50-75K         0.225   0.235   0.211
  Married               0.631   0.632   0.647

These differences in the marginal frequencies reflect the effects of
attrition, as well as differential cross-sectional unit nonresponse in the
refreshment sample and initial wave. Reassuringly, national cross-sectional
polls in October 2008 from Gallup, Fox News, and other major polling
organizations also put Obama favorability ratings close to 62%
(http://www.pollingreport.com/obama_fav.htm), suggesting the respondents in
the refreshment sample faithfully represent Obama's favorability ratings at
the time. In our analyses, we assume that Obama favorability values missing
for reasons other than attrition, that is, due to cross-sectional item and
unit nonresponse, are MAR given the variables in the BLPM model. Previous
survey methodology research indicates that missingness mechanisms for
attrition and cross-sectional nonresponse are distinct (e.g., Loosveldt and
Carton, 1997; Groves and Couper, 1998; Lynn, 2005; Groves, 2006; Smith
and Son, 2010; Olson and Witt, 2011), so that one can plausibly consider
attrition as potentially non-ignorable even when assuming cross-sectional
unit nonresponse is MAR. See Schifeling et al. (2014) for further discussion
of the effects on inferences of non-ignorable cross-sectional unit
nonresponse in the initial wave and refreshment sample.

Figure 3 displays estimated probabilities for Obama favorability for each
of the subgroups defined by the time-invariant variables. For many
subgroups, the estimates for non-attriters in the panel are noticeably
different from those in the refreshment sample. This finding offers an
important correction to the prevailing wisdom about the nature of panel
attrition in political surveys. Research had previously concluded that
attrition bias impacted outcomes related to political engagement (e.g.,
turnout) but not those related to candidate support (e.g., favorability)
(Bartels, 1999; Kruse et al., 2009). The attrition biases within these
subgroups provide evidence to the contrary. It is also noteworthy that the
differences are most pronounced for women, low-income respondents,
respondents aged 45-59, the least educated, and political independents. Many
of these are the sub-populations often thought to lack a voice in American
politics (Gilens, 2005), and these results suggest that panel attrition may
further complicate accurate estimation of their political attitudes and
preferences.

Fig 3. Point estimates and 95% confidence intervals for Obama favorability
in various subgroups. Results presented for the $N_{cp}$ panel
non-attriters, the $N_r$ refreshment sample respondents, and the $N_p$ panel
participants. Inferences based on unweighted analyses of the m = 50
completed datasets, after multiple imputation of missing values via the BLPM
model. The numbers in parentheses are the corresponding subgroup sizes, the
first being the size among non-attriters and the second being among the
completed panel. We randomly select one imputed dataset to obtain the sample
sizes when the background variables are subject to item nonresponse.

Figure 3 also reveals how the BLPM can correct for attrition bias. In
particular, for most subgroups, the point estimate for the $N_p$ panel
participants is shrunk towards the refreshment sample estimate; that is, the
BLPM model corrects the bias due to attrition. The BLPM-corrected intervals
tend to be wider than those computed with the non-attriters. This results
from two sources of variability, namely the estimation of the model
parameters based on a modest-sized refreshment sample and the imputation
of the $N_{ip} = 1{,}011$ values of $Y_2$.

Figure 4 displays inferences for several smaller subgroups of substantive
interest. Here, the BLPM's advantage over AN models is particularly
apparent, as we are able to fit the BLPM model without having to specify
(perhaps arbitrarily) a selection model with interaction effects. The
attrition biases do appear to differ across the groups, suggesting the
importance of using models that can capture interaction effects.
Interestingly, high-income males appear not to experience substantial
attrition bias, whereas various low-income and less educated groups appear
to experience sizable underestimation of Obama favorability. As in Figure 3,
for most groups the BLPM shrinks point estimates towards those in the
refreshment sample.

Of course, evaluating potential attrition bias is not the end goal of our
analyses. Rather, having created attrition-adjusted imputations with the
BLPM model, we now use the m completed panel datasets to better understand
the American electorate during the 2008 campaign. Here, we use
survey-weighted analysis as follows. For each population percentage of
interest and in each of the m completed panel datasets, we compute the
standard ratio estimate of the population percentage and the usual estimated
variance based on the formula for unequal probability sampling with
replacement (Lohr, 1999). We obtain estimates with the survey package
(Lumley, 2012) in R. We then combine the point and variance estimates using
the multiple imputation rules (Rubin, 1987).
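
The two computational steps just described can be sketched in a few lines. The code below is an illustrative Python stand-in, not the R survey package code used for the analysis: `weighted_proportion` computes a weighted ratio estimate with a linearization variance that treats the sample as a single-stage, unstratified with-replacement design (our simplifying assumption), and `rubin_combine` applies the standard multiple imputation combining rules. Both function names are hypothetical.

```python
import math


def weighted_proportion(y, w):
    """Weighted ratio estimate of a proportion and its linearization variance."""
    n = len(y)
    total_w = sum(w)
    p_hat = sum(wi * yi for wi, yi in zip(w, y)) / total_w
    # Linearized residuals; they sum to zero by construction.
    resid = [wi * (yi - p_hat) / total_w for wi, yi in zip(w, y)]
    var = n / (n - 1) * sum(r * r for r in resid)
    return p_hat, var


def rubin_combine(estimates, variances):
    """Combine m completed-data estimates and variances (Rubin, 1987)."""
    m = len(estimates)
    q_bar = sum(estimates) / m                               # point estimate
    u_bar = sum(variances) / m                               # within-imputation
    b = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)   # between-imputation
    total = u_bar + (1 + 1 / m) * b
    if b == 0:
        df = math.inf
    else:
        df = (m - 1) * (1 + u_bar / ((1 + 1 / m) * b)) ** 2
    return q_bar, total, df


# Hypothetical completed-data results from m = 3 imputations.
q_bar, total_var, df = rubin_combine([0.60, 0.62, 0.64], [0.0004] * 3)
```

An interval estimate would then use a t reference distribution with `df` degrees of freedom around `q_bar`, with standard error `math.sqrt(total_var)`.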

Accounting for the wave 1 survey weights, the marginal estimate for
Obama favorability in the last days before Election Day (wave 2) was 0.615
(0.576, 0.655), indicating Obama enjoyed the level of candidate support
necessary to win the November election. As can be seen in Figure 5, Obama
enjoyed higher levels of favorability among some expected subgroups
(liberals, non-whites, and Democrats) in the weighted analysis for both
waves. His high levels of favorability among other subgroups, especially
moderates and Independents, offer the clearest signal of the likely election
outcome. It was only among self-reported Republicans and conservatives that
Obama's favorability levels fell below 0.5.

Fig 4. Point estimates and 95% confidence intervals for Obama favorability
in additional subgroups. Results presented for the $N_{cp}$ panel
non-attriters, the $N_r$ refreshment sample respondents, and the $N_p$ panel
participants. Inferences based on unweighted analyses of the m = 50
completed datasets, after multiple imputation of missing values via the BLPM
model. The numbers in parentheses are the corresponding subgroup sizes, the
first being the size among non-attriters and the second being among the
completed panel. We randomly select one imputed dataset to obtain the sample
sizes when the background variables are subject to item nonresponse.

Comparing estimates across waves also suggests that the American electorate
grew more favorable towards Obama as the campaign unfolded: the average
marginal favorability in wave 1 is 0.569 (0.542, 0.597), as illustrated
in Figure 5. The increase in marginal favorability rating across waves is
0.046 (0.003, 0.089). In terms of attitude changes during the campaign among
the various subgroups, most became slightly more favorable over time, with
the exception of conservatives and Republicans, who became slightly less
favorable from wave 1 to wave 2. Most of these changes are not statistically
significant, owing to small subgroup sample sizes. The statistically
significant changes in attitudes are among Democrats, liberals, moderates,
less educated respondents, individuals with middle income, and males, who
showed substantial increases in favorability towards Obama between wave 1
and wave 2. Overall, these patterns suggest that the partisan polarization
in evaluations of Obama that characterizes American politics today actually
started during the 2008 presidential campaign (Burden and Hillygus, 2009).

We also fit the BLPM model assuming that $Y_1$ and X are conditionally
independent of W within latent classes. Reassuringly, the conclusions from
this version of the BLPM are similar to those presented previously.

For comparison, we fit two additional models: the DPMPM model described in
Section 5 that does not have $Y_2$ depend on W, and a MAR imputation model
based on the DPMPM (as in Si and Reiter, 2013) that disregards W entirely.
The results for both models, reported in Section 3 of the online supplement,
are similar to each other but different from the BLPM results. These two
alternative models generally result in point estimates quite similar to
those from the non-attriters; in other words, they suggest that panel
attrition bias in Obama favorability is ignorable. This seems implausible
given the differences in Obama favorability seen in the non-attriters and
the refreshment sample.

We also fit the semi-parametric AN model of Si, Reiter and Hillygus
(2014), which assumes a probit regression for W conditional on
$(X, Y_1, Y_2)$ and a DPMPM model for $(X, Y_1, Y_2)$. Results are reported
in Section 4 of the online supplement. Both the semi-parametric AN and BLPM
models suggest that the attrition is non-ignorable. Point estimates for the
quantities in Figures 3 and 4 differ slightly; however, the differences are
modest relative to the multiple imputation variances. We prefer the BLPM
results, as the model diagnostics of Section 6.4 suggest that the BLPM fits
the data more effectively than the semi-parametric AN model. We further note
that the semi-parametric AN model is computationally more intensive than the
BLPM, as the probit regression for W requires auxiliary data augmentation
and Metropolis steps that are not necessary in the BLPM.

Fig 5. Dynamics of Obama favorability ratings between wave 1 and wave 2.
Top plot compares the marginal estimates in wave 1 and wave 2. Bottom plot
presents the differences between wave 2 and wave 1. Results based on the
$N_p$ panel participants after multiple imputation via the BLPM model.
Inference based on survey-weighted estimation.

6.4. Model Diagnostics. To check the fit of the models, we follow the
advice in Deng et al. (2013) and use posterior predictive checks (Meng,
1994b; Gelman et al., 2005; He, Zaslavsky and Landrum, 2010; Burgette
and Reiter, 2010). We use the BLPM model to generate 500 datasets
with no missing data in $(X, Y_1, Y_2, W)$, randomly sampling from the
T = 1000 available completed datasets. Let $\{D^{(1)}, \ldots, D^{(500)}\}$
be the collection of the completed datasets. For each $D^{(t)}$, we also use
the model to generate new values of $Y_2$ for all cases in the panel,
including cases with $W_i = 1$, and in the refreshment sample. This can be
done after running the MCMC to convergence as follows. For given draws of
parameter values and any item missing data in $(X, Y_1)$, sample new values
for the observed and imputed $Y_2$ using the distributions in the online
supplement. Let $\{R^{(1)}, \ldots, R^{(500)}\}$ be the collection of the
replicated datasets.

We then compare statistics of interest in $\{R^{(1)}, \ldots, R^{(500)}\}$
to those in $\{D^{(1)}, \ldots, D^{(500)}\}$. Specifically, suppose that S
is some statistic of interest, such as a marginal or conditional probability
in our context. For t = 1, ..., 500, let $S_{D^{(t)}}$ and $S_{R^{(t)}}$ be
the values of S computed from $D^{(t)}$ and $R^{(t)}$, respectively. We
compute the two-sided posterior predictive probability,

$$
ppp = \frac{2}{500} \min\left( \sum_{t=1}^{500} I\left(S_{R^{(t)}} - S_{D^{(t)}} > 0\right), \; \sum_{t=1}^{500} I\left(S_{D^{(t)}} - S_{R^{(t)}} > 0\right) \right).
$$

When the value of ppp is small, for example, less than 5%, this suggests the
replicated datasets are systematically different from the observed dataset
with respect to that statistic. When the value of ppp is not small, the
imputation model generates data that look like the completed data for that
statistic. Recognizing the limitations of posterior predictive probabilities
(Bayarri and Berger, 1998), we interpret the resulting ppp values as
diagnostic tools rather than as evidence from hypothesis tests that the
model is “correct.”
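
The two-sided posterior predictive probability described above amounts to counting how often the replicated statistic falls above, and how often below, its completed-data counterpart. A minimal sketch, with a hypothetical function name and made-up input values, is:

```python
def two_sided_ppp(s_completed, s_replicated):
    """Two-sided posterior predictive probability for one statistic.

    s_completed[t] and s_replicated[t] hold the statistic computed from the
    t-th completed dataset and from its paired replicated dataset.
    """
    if len(s_completed) != len(s_replicated) or not s_completed:
        raise ValueError("need two non-empty sequences of equal length")
    pairs = list(zip(s_completed, s_replicated))
    above = sum(1 for d, r in pairs if r - d > 0)  # replicate exceeds completed
    below = sum(1 for d, r in pairs if d - r > 0)  # completed exceeds replicate
    return 2.0 / len(pairs) * min(above, below)


# Made-up values: replicates scatter evenly around the completed-data values.
balanced = two_sided_ppp([0.5, 0.5, 0.5, 0.5], [0.6, 0.4, 0.7, 0.3])

# Made-up values: replicates are always larger, signalling poor fit.
one_sided = two_sided_ppp([0.5, 0.5, 0.5, 0.5], [0.6, 0.7, 0.8, 0.9])
```

In the first case the value is 1, the largest possible; in the second it is 0, the pattern that would flag a statistic the model fails to reproduce.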

As statistics, we select $\Pr(Y_2 = 1)$ in the refreshment sample,
$\Pr(Y_2 = 1 \mid W = 1)$ in the panel,
$\Pr(Y_1 = 1, Y_2 = 1 \mid W = 1)$ in the panel, and
$\Pr(Y_2 = 1 \mid X, W = 1)$ in the panel for all conditional probabilities
involved in the subgroup analyses in Figures 3 and 4. This results in 38
quantities of interest. A histogram of the 38 values of ppp is displayed in
Section 4 in the online supplement. The analysis does not reveal any serious
lack of model fit, as none of the ppp values are below 0.20.

We repeat the same model diagnostics on the semi-parametric AN model
of Si, Reiter and Hillygus (2014). Many of the posterior predictive
probabilities are uncomfortably small. We believe the differences between
the semi-parametric AN and BLPM models arise because the predictor function
in the AN model for W used by Si, Reiter and Hillygus (2014) includes only
main effects, whereas the BLPM model does not a priori enforce a model for
attrition.
7. Concluding Remarks. The proposed Bayesian latent pattern mixture
model offers a flexible way to leverage the information in refreshment
samples in categorical datasets, helping to adjust for bias due to
non-ignorable attrition. We have used this approach in analyzing the APYN
study to better understand the preferences of the American electorate during
the 2008 presidential campaign. Our findings suggest that panel attrition
biased downward estimates of Obama favorability among many subgroups in the
electorate. With a more accurate assessment of voter attitudes, we find that
Obama had sufficiently high levels of favorability among key subgroups
(independents and moderates) to suggest that the election outcomes were not
really in doubt by late October.

The BLPM approach has key advantages over existing applications of
additive non-ignorable models. The BLPM avoids the difficult task of
specifying a binary regression model for the attrition process. Unlike
standard latent class models, the BLPM fully utilizes the information in the
refreshment sample by allowing for conditional dependence within latent
classes between wave 2 variables and the attrition indicator. We note that a
wide range of existing surveys have data structures amenable to BLPM
modeling, including the General Social Survey, the 2008 American National
Election Study, the Survey of Income and Program Participation, and the
National Educational Longitudinal Study, to name just a few.

As with other modeling strategies for refreshment samples, the validity
of the BLPM depends on several overarching assumptions. First, the initial
wave of the panel and the refreshment sample should be representative of
the same population of interest. Put another way, the units in the target
population should not change substantially between wave 1 and wave 2,
although certainly the distributions of the substantive variables can do so.
Second, any unit (or item) nonresponse other than that due to attrition is
missing at random. Third, to ensure identifiability, we assume conditional
independence between wave 1 survey variables and the attrition indicator
within classes. When this assumption is unreasonable, the BLPM model, like
any additive pattern mixture model, could fail to correct for attrition
bias. Unfortunately, the data do not provide information about the
plausibility of this assumption. Methods for assessing the sensitivity of
results to violations of this assumption, as well as to violations of the
two representativeness assumptions, are important areas for research.
SUPPLEMENTARY MATERIAL

Supplement A: Bayesian Latent Pattern Mixture Models for
Handling Attrition in Panel Studies With Refreshment Samples.
The supplement includes the MCMC algorithms for the BLPM and DPMPM models,
additional analyses of the APYN data using the DPMPM model and
semi-parametric AN model, and details of the BLPM model diagnostics.